Discovering Word Senses for Polysemous Words Using Feature Domain Similarity

Authors

  • Noriko Tomuro
  • Kyoko Kanzaki
  • Hitoshi Isahara
Abstract

This paper presents a new clustering algorithm, DSCBC, designed to automatically discover the senses of polysemous words. DSCBC extends CBC (Pantel and Lin, 2002) by incorporating feature domain similarity: the similarity between the features themselves, obtained a priori from sources external to the dataset. By incorporating feature domain similarity in clustering, DSCBC produces monosemous clusters (each cluster lies in a single domain), thereby discovering the individual senses of polysemous words. For evaluation, we apply the algorithm to Japanese and English adjectives and compare the derived senses against manually created lexicons. The results show significant improvements over other clustering algorithms, including CBC.
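The abstract does not state how DSCBC combines feature domain similarity with clustering, so the following is only a rough illustration of the general idea, not the paper's method. It uses a soft cosine measure, a swapped-in technique, in which two words can be similar even when they share no feature, provided their features are related a priori. The `DOMAIN` table, the feature names, and the example vectors are all hypothetical.

```python
import math


def soft_cosine(u, v, feature_sim):
    """Cosine-style similarity that also credits matches between related features.

    u, v: dicts mapping feature name -> weight.
    feature_sim: function (f, g) -> similarity in [0, 1], with feature_sim(f, f) == 1.
    """
    def inner(a, b):
        # Generalised dot product: every feature pair contributes,
        # scaled by how similar the two features are.
        return sum(wa * wb * feature_sim(f, g)
                   for f, wa in a.items()
                   for g, wb in b.items())

    denom = math.sqrt(inner(u, u)) * math.sqrt(inner(v, v))
    return inner(u, v) / denom if denom else 0.0


# Hypothetical a-priori assignment of features to domains (illustration only).
DOMAIN = {"coffee": "taste", "tea": "taste", "voice": "sound"}


def exact_match(f, g):
    # Ordinary cosine: features count only when they are identical.
    return 1.0 if f == g else 0.0


def same_domain(f, g):
    # Feature domain similarity: identical features, or features in one domain.
    if f == g:
        return 1.0
    return 1.0 if f in DOMAIN and g in DOMAIN and DOMAIN[f] == DOMAIN[g] else 0.0


# Hypothetical feature vectors for three word usages.
usage_a = {"coffee": 1.0}
usage_b = {"tea": 1.0}
usage_c = {"voice": 1.0}

plain_ab = soft_cosine(usage_a, usage_b, exact_match)
domain_ab = soft_cosine(usage_a, usage_b, same_domain)
domain_ac = soft_cosine(usage_a, usage_c, same_domain)
```

Under these assumptions, `usage_a` and `usage_b` have no overlap by exact feature match, yet they become similar once features in the same domain are treated as related, while `usage_c` stays apart from both. A clustering step built on such a measure would tend to group usages by domain, which is the intuition behind monosemous clusters.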


Similar resources

Paper has been my ruin: conceptual relations of polysemous senses

Polysemous words have different but related meanings (senses), such as paper meaning a newspaper or writing material. Six experiments examined the similarity of word senses using categorization and inference tasks. The experiments found that subjects did not categorize together phrases that used a polysemous word in different senses, though they did when the word was used in the same sense. Dif...


Learning Similarity-based Word Sense Disambiguation from Sparse Data

We describe a method for automatic word sense disambiguation using a text corpus and a machine-readable dictionary (MRD). The method is based on word similarity and context similarity measures. Words are considered similar if they appear in similar contexts; contexts are similar if they contain similar words. The circularity of this definition is resolved by an iterative, converging process, in...


Similarity-based Word Sense Disambiguation

We describe a method for automatic word sense disambiguation using a text corpus and a machine-readable dictionary (MRD). The method is based on word similarity and context similarity measures. Words are considered similar if they appear in similar contexts; contexts are similar if they contain similar words. The circularity of this definition is resolved by an iterative, converging process, in ...


Making Sense of Word Sense Variation

We present a pilot study of word-sense annotation using multiple annotators, relatively polysemous words, and a heterogeneous corpus. Annotators selected senses for words in context, using an annotation interface that presented WordNet senses. Interannotator agreement (IA) results show that annotators agree well or not, depending primarily on the individual words and their general usage properti...


Polysemy in Sentence Comprehension: Effects of Meaning Dominance.

Words like church are polysemous, having two related senses (a building and an organization). Three experiments investigated how polysemous senses are represented and processed during sentence comprehension. On one view, readers retrieve an underspecified, core meaning, which is later specified more fully with contextual information. On another view, readers retrieve one or more specific senses...



Publication date: 2007